Theory and Applications of Elliptically Contoured
and Related Distributions

T.  W.  Anderson  and  Kai-Tai  Fang 


1.  Introduction. 

The multivariate normal distribution has long served as the standard model for the
statistical analysis of multivariate observations. Statisticians have been interested in
generalizing the model from the normal population to a wider class of distributions that retain
the most important properties of the multivariate normal distribution. In the past twenty
years it has been found that the class of elliptically contoured distributions (ECD) can be
regarded as a suitable extension of the multivariate normal distribution. The class of ECD
includes many multivariate distributions, such as the multivariate normal, the multivariate
t, the multivariate Cauchy, the multivariate Laplace, the multivariate uniform, mixtures
of normal distributions, and the multivariate stable distributions. Many authors have
developed the theory and methods of statistical inference for the ECD. Survey papers have
been published by Muirhead [70], Chmielewski [19], and Fang [37].

The purpose of this paper is to survey contributions to the theory and applications
of ECD and related distributions, mainly by Chinese statisticians. When the second author
visited Stanford University in the academic year 1981-82 to pursue research, the first
author suggested ECD as a fruitful area of investigation, and they cooperated in
this venture. Upon his return to China the second author directed his doctoral students
in conducting research on this subject. Most of the papers were originally published in
Chinese journals and collected in English in the volume [39]. Under the influence of this
work a number of Chinese authors entered this area and made valuable contributions,
as listed in the references. We regret any omission of major contributions due to the
limitations of our survey.

There are several ways to define ECD and its standard form, spherical distributions
(SD), by using different properties of the normal distribution. (See, for example, the
preface of [39] and Section 1.1 of [49].) One is the following. The random vector $X$ has
the distribution $N(\mu, \Sigma)$ if and only if

$$X \overset{d}{=} \mu + AY,$$

where $AA' = \Sigma$ and $Y$ has the standard normal distribution $N(0, I)$. Here $\overset{d}{=}$ denotes that the two
sides of the equality have the same distribution. For this kind of definition of ECD we
define the spherical distribution first.

Definition 1.1. An $n \times 1$ random vector $X$ is said to have a spherical distribution if
for each $Q \in O(n)$

$$QX \overset{d}{=} X, \tag{1.1}$$

where $O(n)$ denotes the set of $n \times n$ orthogonal matrices.

The  following  theorem  gives  some  equivalent  definitions  of  SD. 

Theorem 1.1. Let $X$ be an $n \times 1$ random vector. Then the following statements are
equivalent:

1) $QX \overset{d}{=} X$ for each $Q \in O(n)$;

2) The c.f. of $X$, $E e^{it'X}$, is a function of $t't$, $t \in R^n$;

3) $X$ has a stochastic representation

$$X \overset{d}{=} R\,U^{(n)} \tag{1.2}$$

for some $R \ge 0$, where $R$ is independent of $U^{(n)}$ and the latter is uniformly distributed on
the unit sphere in $R^n$;

4) For any $a \in R^n$ we have

$$a'X \overset{d}{=} \|a\|\,X_1, \tag{1.3}$$

where $\|a\|$ is the Euclidean norm and $X_1$ is the first component of $X$.

From part 2) of the theorem the c.f. of a SD has the form $\phi(t't)$, where $\phi(\cdot)$ is a scalar
function. Therefore, we write $X \sim S_n(\phi)$. The set of all possible $\phi$'s is denoted by $\Phi_n$;
that is,

$$\Phi_n = \{\phi(\cdot) : \phi(t_1^2 + \cdots + t_n^2) \text{ is an } n\text{-dimensional c.f.}\}. \tag{1.4}$$

The probability method, which treats models directly through random variates rather than
through their distribution functions or c.f.'s, plays an important role in the theory of ECD. In particular,
Anderson and Fang [2,3] gave a systematic discussion of the $\overset{d}{=}$ operator. Many results
mentioned in this paper were obtained by the probability method and show that the
$\overset{d}{=}$ operator is a powerful tool. The same is true in [13], [14], and Zolotarev's book [91].
The stochastic representation (1.2) is therefore one of the most important properties
of SD. It yields the following properties: (a) The set of SD's is equivalent to
the set of nonnegative random variables. (b) The SD is essentially a function of a random
variable $R$. (c) $X/\|X\|$ and $\|X\|$ are independent, and $R \overset{d}{=} \|X\|$ and $U^{(n)} \overset{d}{=} X/\|X\|$.
(d) $X$ has a density which is of the form $g(x'x)$ if and only if $R$ has the density

$$f(r) = \frac{2\pi^{n/2}}{\Gamma(n/2)}\, r^{n-1} g(r^2), \qquad r \ge 0. \tag{1.5}$$

(In this case we prefer to write $X \sim S_n(g)$ instead of $X \sim S_n(\phi)$, and $g$ is called the
density generating function [49].) (e) Let $t(X)$ be a statistic satisfying $t(aX) = t(X)$ for
any $a > 0$; if $P(X = 0) = 0$, then $t(X) \overset{d}{=} t(Z)$ where $Z \sim N(0, I)$, i.e., the distribution
of $t(X)$ is invariant in the class; for instance, the $t$-statistic has the same distribution for
all members of the class. (f) The marginal distribution of $X_1, \ldots, X_m$ is again a SD, which
has the stochastic representation (1.2) with $m$ instead of $n$ and $RB$ instead of $R$, where
$B \ge 0$, $B^2 \sim \mathrm{Beta}(m/2, (n - m)/2)$, and $R$, $B$, and $U^{(m)}$ are independent.
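Properties (c) and (e) can be checked numerically. The following is a minimal sketch, assuming the usual construction of $U^{(n)}$ by normalizing a standard normal vector and the one-sample $t$-statistic with divisor $n - 1$ in the variance; the function names are illustrative and do not come from the references.

```python
import math
import random


def uniform_on_sphere(n, rng):
    """Draw U^(n) by normalizing a standard normal vector (assumed construction)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in z))
    return [v / norm for v in z]


def t_statistic(x):
    """One-sample t-statistic sqrt(n) * mean / s, with s^2 using divisor n - 1."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x) / (n - 1)
    return math.sqrt(n) * mean / math.sqrt(s2)


rng = random.Random(0)
u = uniform_on_sphere(5, rng)

# Two spherical vectors X = R * U sharing the direction U but with different radii R.
x_small = [0.3 * v for v in u]
x_large = [40.0 * v for v in u]

t_small = t_statistic(x_small)
t_large = t_statistic(x_large)
```

Because the $t$-statistic is unchanged when the vector is multiplied by a positive constant, `t_small` and `t_large` agree up to rounding: the radius $R$ drops out, which is why the distribution of such a statistic is the same for every member of the class.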


Definition 1.2. An $n \times 1$ random vector $X$ is said to have an elliptically contoured
distribution (ECD) with parameters $\mu$ and $\Sigma$ ($n \times n$) if

$$X \overset{d}{=} \mu + AY, \qquad Y \sim S_k(\phi), \tag{1.6}$$

where $A : n \times k$ and $AA' = \Sigma$ with $\mathrm{rank}(\Sigma) = k$. We write $X \sim EC_n(\mu, \Sigma, \phi)$.

Many  properties  of  ECD  can  be  transferred  from  those  of  SD  by  means  of  (1.6).  The 
following  properties  are  important  and  are  needed  in  this  paper. 

1)  A  linear  transformation  of  an  ECD  is  again  an  ECD;  in  particular,  all  marginal 
distributions  of  an  ECD  are  ECD. 

2) All conditional distributions of an ECD are ECD.

3) The c.f. of $EC_n(\mu, \Sigma, \phi)$ is $\exp(it'\mu)\,\phi(t'\Sigma t)$.

4) $X \sim EC_n(\mu, \Sigma, \phi)$ with $\mathrm{rank}(\Sigma) = k$ if and only if

$$X \overset{d}{=} \mu + R A U^{(k)}, \tag{1.7}$$

where $R \ge 0$ is independent of $U^{(k)}$, $A : n \times k$, and $AA' = \Sigma$.

5) If $Y$ has the density $g(x'x)$ and $A$ is square and nonsingular, then $X = \mu + AY$ has
the density

$$|\Sigma|^{-1/2}\, g\big((x - \mu)'\Sigma^{-1}(x - \mu)\big), \tag{1.8}$$

where $AA' = \Sigma$; this is denoted by $X \sim EC_n(\mu, \Sigma, g)$. The contours of constant density
are ellipsoids

$$(x - \mu)'\Sigma^{-1}(x - \mu) = \text{const}.$$

This  fact  leads  to  the  name  of  ECD.  There  are  various  other  terms  used,  such  as  round 
distribution,  isotropic  distribution  [24]  for  SD,  and  ellipsoidal  symmetric  distribution  in 
the  literature. 
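The link between the stochastic representation and the ellipsoidal contours can be illustrated in two dimensions. The sketch below is only an illustration under stated assumptions: the matrix `a`, the location `mu`, and the fixed radius are hypothetical values, and the uniform direction on the unit circle is generated from a uniform angle.

```python
import math
import random


def sample_ecd_2d(mu, a, radius, rng):
    """Return X = mu + R * A * U^(2) for a 2x2 matrix A given as a list of rows."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    u = (math.cos(angle), math.sin(angle))  # uniform on the unit circle
    au = (a[0][0] * u[0] + a[0][1] * u[1],
          a[1][0] * u[0] + a[1][1] * u[1])
    return (mu[0] + radius * au[0], mu[1] + radius * au[1])


def quadratic_form(x, mu, a):
    """Compute (x - mu)' Sigma^{-1} (x - mu) with Sigma = A A'."""
    s11 = a[0][0] ** 2 + a[0][1] ** 2
    s12 = a[0][0] * a[1][0] + a[0][1] * a[1][1]
    s22 = a[1][0] ** 2 + a[1][1] ** 2
    det = s11 * s22 - s12 * s12
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form with the explicit inverse of a symmetric 2x2 matrix.
    return (s22 * d0 * d0 - 2.0 * s12 * d0 * d1 + s11 * d1 * d1) / det


rng = random.Random(1)
mu = (1.0, -2.0)              # hypothetical location
a = [[2.0, 0.0], [1.0, 0.5]]  # hypothetical nonsingular A
radius = 3.0                  # a fixed value of R, for illustration

x = sample_ecd_2d(mu, a, radius, rng)
q = quadratic_form(x, mu, a)
```

For nonsingular $A$ the quadratic form equals $R^2$ whatever direction is drawn, so every point generated with the same radius lies on the same ellipse; here `q` is 9 up to rounding.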

For more properties and detailed discussion, see [49].

This paper is organized as follows. Several types of spherical and elliptical matrix
distributions and their relationships are discussed in Section 2. The distributions of their
quadratic forms and the associated Cochran's theorem are presented there as well. Some results
on estimation of parameters of and testing hypotheses about ECD are given in Sections 3
and 4, respectively. The stochastic representations (1.2) and (1.6) give the structure of SD
and ECD. The same idea can be applied to some other distributions and produces other
classes of symmetric multivariate distributions; a summary constitutes Section 6. Section
5 collects applications of ECD models in regression analysis, principal component analysis,
canonical correlation analysis, discriminant analysis, and econometrics. The last section
consists of miscellaneous results.

2.  Classes  of  Distributions  and  Distributions  of  Quadratic  Forms 

A sample of $n$ observations $X_{(1)}, \ldots, X_{(n)}$ from a multivariate distribution can be
expressed by an $n \times p$ matrix

$$X = \begin{pmatrix} X_{(1)}' \\ \vdots \\ X_{(n)}' \end{pmatrix} = (X_1, \ldots, X_p). \tag{2.1}$$

This matrix of observations is the basis of multivariate analysis and data analysis. Therefore,
we study its distribution first. If the observation vectors are drawn independently
from $N(\mu, \Sigma)$, then the matrix $X$ has a matrix normal distribution $N_{n \times p}(M, I \otimes \Sigma)$ with
the c.f.

$$\psi(T) = E[\exp(i\,\mathrm{tr}(T'X))] = \exp(i\,\mathrm{tr}\,T'M)\,\exp\!\big(-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma T'T\big), \tag{2.2}$$




where $M = \mathbf{1}\mu'$, and

$$T = (t_{(1)}, \ldots, t_{(n)})' = (t_1, \ldots, t_p). \tag{2.3}$$

When $\mu = 0$, $\psi(T)$ is a function of $T'T$ and is invariant under $n \times n$ orthogonal
transformations.

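When the parent distribution is more generally $EC_p(\mu, \Sigma, \phi)$, the c.f. of $X$ is

$$E\big(e^{i\,\mathrm{tr}\,T'X}\big) = \exp(i\,\mathrm{tr}\,T'M) \prod_{j=1}^{n} \phi\big(t_{(j)}'\Sigma\, t_{(j)}\big), \tag{2.4}$$

where $M$ is the same matrix as in (2.2). Unfortunately, most results for this model are
based on asymptotic theory and numerical evaluation, with the exception of the normal parent distribution $N_p(\mu, \Sigma)$.
(See [69] and [70].)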

An alternative model for random $X$ is that the rows of $X$ are uncorrelated and each
has mean $\mu$ and covariance matrix $\Sigma$. This model generates various spherical/elliptical
matrix distributions.

Corresponding to the invariance of (1.1), Dawid [20-22] proposed two classes of spherical
matrix distributions (SMD).

Definition 2.1. Let $X$ be an $n \times p$ random matrix. If $QX \overset{d}{=} X$ for every $Q \in O(n)$ we
call $X$ left-spherical and write $X \in LS$. If $X$ and $X'$ are both $LS$ we call $X$ symmetrically
spherical and write $X \in SS$.

In  terms  of  the  c.f.  Anderson  and  Fang  [3]  suggested  the  following. 

Definition 2.2. An $n \times p$ random matrix $X$ is said to have a multivariate spherical
distribution if the c.f. of $X$ has the form $\phi(t_1't_1, t_2't_2, \ldots, t_p't_p)$; this is denoted $X \in MS$
or $X \sim MS_{n \times p}(\phi)$.

The most direct extension of the spherical distribution to the matrix case is by means of
the vector operator $\mathrm{vec}(\cdot)$, defined as

$$\mathrm{vec}(X) = (X_1', \ldots, X_p')', \tag{2.5}$$

and considered by many authors, such as Kariya [64], Jensen and Good [62], Fraser and
Ng [59], and Anderson and Fang [3,4].

Definition 2.3. Let $X$ be an $n \times p$ random matrix. If $\mathrm{vec}(X)$ is spherical, we call
$X$ vector-spherical and write $X \in VS$.

The above four classes of spherical matrix distributions have been studied individually
by the above authors. Fang and Chen [42,43] established relationships among them and
found more properties as follows. (They used $\mathcal{F}_1$, $\mathcal{F}_2$, $\mathcal{F}_3$, and $\mathcal{F}_s$ to denote the classes
$LS$, $MS$, $VS$, and $SS$, respectively.)

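Theorem 2.1. The c.f. of $X$ has the form

$$\begin{cases}
\phi(T'T), & \text{if } X \in LS, \\
\phi[\mathrm{diag}(T'T)], & \text{if } X \in MS, \\
\phi[\mathrm{tr}(T'T)], & \text{if } X \in VS, \\
\phi[\mathrm{eig}(T'T)], & \text{if } X \in SS,
\end{cases} \tag{2.6}$$

where $\mathrm{diag}(A) = (a_{11}, \ldots, a_{pp})$ and $\mathrm{eig}(A)$ is the vector of eigenvalues of $A$. As a consequence we
have

$$VS \subset MS \subset LS \quad \text{and} \quad VS \subset SS \subset LS. \tag{2.7}$$

Furthermore, $VS = MS \cap SS$.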

In the following exposition $X \sim LS(\phi)$ denotes that $X \in LS$ and the c.f. of $X$ is
$\phi(T'T)$, with similar notations for the other cases.

Theorem 2.2. If the $n \times p$ random matrix $X$ has one of the spherical matrix
distributions, then it has one of the following stochastic representations:

LS: $X \overset{d}{=} U_1 A$, where $U_1 : n \times p$, $A : p \times p$, $U_1 \in LS$, $U_1'U_1 = I_p$, $A'A \overset{d}{=} X'X$, and
$A$ and $U_1$ are independent;

MS: $X \overset{d}{=} U_2 R$, where $U_2$ has i.i.d. columns, each distributed as $U^{(n)}$, and $R =
\mathrm{diag}(R_1, \ldots, R_p) \ge 0$ is independent of $U_2$;

SS: $X \overset{d}{=} U_1 \Lambda V$ is the singular value decomposition, where $U_1$, $\Lambda$, and $V$ are
independent, $U_1$ is the same as in LS, $V' \in LS$, $V'V = I_p$, and $\Lambda$ is a diagonal
matrix with nonnegative elements;

VS: $X \overset{d}{=} R U_3$, where $R \ge 0$ is independent of $U_3$ and $\mathrm{vec}(U_3) \overset{d}{=} U^{(np)}$.
Each of the distributions of $U_1$, $U_2$, and $U_3$ is called the uniform matrix distribution
with its respective specific meaning. Furthermore, they have the stochastic representations

$$U_1 \overset{d}{=} Y(Y'Y)^{-1/2}, \quad U_2 \overset{d}{=} \big(Y_1/\|Y_1\|, \ldots, Y_p/\|Y_p\|\big), \quad U_3 \overset{d}{=} Y/(\mathrm{tr}\,Y'Y)^{1/2}, \tag{2.8}$$

where $Y = (Y_1, \ldots, Y_p)$ has the standard matrix normal distribution $N(0, I_n \otimes I_p)$.
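The representations of $U_2$ and $U_3$ in terms of a standard normal matrix translate directly into a sampling recipe. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and $U_1$ is omitted because it needs a matrix inverse square root.

```python
import math
import random


def standard_normal_matrix(n, p, rng):
    """Return an n x p matrix (list of rows) of independent N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]


def uniform_u2(y):
    """U_2: normalize each column of Y to unit Euclidean length."""
    n, p = len(y), len(y[0])
    norms = [math.sqrt(sum(y[i][j] ** 2 for i in range(n))) for j in range(p)]
    return [[y[i][j] / norms[j] for j in range(p)] for i in range(n)]


def uniform_u3(y):
    """U_3: divide Y by (tr Y'Y)^{1/2}, which is its Frobenius norm."""
    frob = math.sqrt(sum(v * v for row in y for v in row))
    return [[v / frob for v in row] for row in y]


rng = random.Random(2)
y = standard_normal_matrix(6, 3, rng)
u2 = uniform_u2(y)
u3 = uniform_u3(y)

# Each column of U_2 has unit length; vec(U_3) has unit length.
u2_column_norms = [math.sqrt(sum(u2[i][j] ** 2 for i in range(6))) for j in range(3)]
u3_frobenius = math.sqrt(sum(v * v for row in u3 for v in row))
```

The two normalizations differ only in what is held to unit length: the individual columns for $U_2$, and the whole stacked vector for $U_3$.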

The c.f.'s of $U_2$ and $U_3$ can be found by the result of Schoenberg [75]. Zhang and
Fang [89] obtained an expression of the c.f. of $U_1$ in terms of the hypergeometric function.

With the stochastic representations given by Theorem 2.2 many results can be
transferred from multivariate normal populations to these wider classes. A number of authors,
such as Dawid [20], Chmielewski [18], Fraser and Ng [59], Jensen and Good [62], and
Anderson and Fang [3], found invariant statistics in these classes. Fang and Chen [42]
obtained necessary and sufficient conditions for invariant statistics in the four classes. For
simplicity, we cite only the theorem in the LS case. Let

$$LS^{+} = \{X : X \in LS \text{ and } P(X'X > 0) = 1\}. \tag{2.9}$$

Theorem 2.3. Let $t(X)$ be a statistic. Then the distribution of $t(X)$ is invariant
in $LS^{+}$ if and only if $t(XA) \overset{d}{=} t(X)$ for each $A \in UT$, the set of $p \times p$ upper triangular
matrices with positive diagonal elements.

As an application of Theorem 2.3, one can find many useful statistics (such as the
Wilks statistic, the Hotelling $T^2$, and the statistic for testing equality of several covariance
matrices) that are invariant in $LS^{+}$. This fact shows a major advantage
of spherical matrix distributions and makes it possible to extend the techniques of multivariate
analysis to these wider classes. For more details see Section 6.

A random matrix $X$ with an elliptical matrix distribution (EMD) is the linear
transform

$$X = M + YA, \tag{2.10}$$

where $Y$ has a spherical matrix distribution in any of the above classes and $M$ and $A$
are constant matrices. Thus we have four classes of elliptical matrix distributions; we
denote them by $LE$, $SE$, $ME$, and $VE$, respectively. Zhang, Fang, and Chen [90] gave a
comprehensive study of these classes. They found marginal and conditional distributions,
stochastic decompositions, moments, and invariant statistics. It should be noted that the
matrix normal $X$ with distribution $N(0, I \otimes \Sigma)$ is a member of $LS$, but the rows are not
necessarily spherical.
The distributions of quadratic forms and Cochran's theorem play an important role in
multivariate analysis. Let the $n \times p$ random matrix $X$ in $LS$ be partitioned into $m$ parts
$X_1, \ldots, X_m$ with $n_1, \ldots, n_m$ rows, respectively. When $p = 1$ and $X$ has a density, Kelker
[65] obtained the distribution of $X_1'X_1$. Anderson and Fang [2] derived the distribution
of $X_j'X_j$, $j = 1, \ldots, m$, without the assumption of $X$ having a density. Subsequently, they
[3] obtained the distributions of the sample covariance matrix, the correlation matrix, the
multiple correlation coefficient, the generalized variance, the eigenvalues of the sample
covariance matrix, etc. Fang and Wu [57] extended their results to the case of $LE$ with
$M = 0$ in (2.10). When $M \ne 0$, Teng, Fang, and Deng [77] obtained the density of $X_1'X_1$
under some regularity conditions, thus extending the result of Cacoullos and Koutras [11]
for $p = 1$. Fan [25] obtained the noncentral $t$-, $F$-, and $T^2$-distributions by using the
method of [11] and gave a detailed discussion of the distributions.

Let $X \sim N(\mathbf{1}\mu', I_n \otimes I_p)$. The basic features of Cochran's theorem can be formulated
as follows:

1) $X'AX \sim \chi^2_k(\mu'A\mu)$ (the noncentral chi-square distribution with $k$ degrees of freedom
and noncentrality parameter $\mu'A\mu$) if and only if $A^2 = A$ and $\mathrm{rank}(A) = k$;

2) $X'AX$ and $X'BX$ are independent if and only if $AB = 0$.

We shall call the result
the central Cochran's theorem if $\mu = 0$; otherwise we shall call it the noncentral Cochran's
theorem. Anderson and Styan [6] reviewed various extensions of Cochran's theorem for
the normal case.
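The algebraic condition in 1) is easy to verify for a concrete matrix. As a minimal sketch, the code below checks idempotency for the centering matrix $I_n - \frac{1}{n}\mathbf{1}\mathbf{1}'$, chosen here purely as a familiar example; for a symmetric idempotent matrix the rank equals the trace, which is the fact the last line relies on.

```python
def centering_matrix(n):
    """Return A = I_n - (1/n) 1 1' as a list of rows."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


n = 5
A = centering_matrix(n)
A2 = matmul(A, A)

# Largest entrywise gap between A^2 and A; zero up to rounding if A is idempotent.
max_gap = max(abs(A2[i][j] - A[i][j]) for i in range(n) for j in range(n))

# For a symmetric idempotent matrix the trace equals the rank.
rank_A = sum(A[i][i] for i in range(n))
```

Here `max_gap` is zero up to rounding and `rank_A` is $n - 1$, matching the $n - 1$ degrees of freedom associated with a centered sum of squares.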

We would like to mention here several contributions to Cochran's theorem for ECD
and LS. Kelker [65] extended the central Cochran's theorem to ECD under the condition
that $X$ has a density with finite fourth moments. Anderson and Fang [2,3], using the $\overset{d}{=}$
operator, gave a new approach to various extensions of Cochran's theorem in ECD without
the condition of Kelker. Fang and Wu [57] extended their results to more general quadratic
forms. Because of the needs of the theory of multivariate analysis, Fang, Fan, and Xu [45]
extended the results in [4,5] to the case where the matrix $A$ is random and gave some
applications to $T^2$- and Wilks statistics and Tukey testing.

The noncentral case of Cochran's theorem is much more difficult to handle than the
central case. Thus the results for ECD are not as extensive in the noncentral case as in the
central case. Under the assumption of finite fourth moments Fan [26] proved the noncentral
Cochran's theorem using c.f.'s. Zhang [85] extended the results of Fang and Wu [57] to
the noncentral situation, as well as the result of Fan [26], but under the condition of finite
$2n$th moment.


3.  Estimation  of  Parameters  of  Elliptically  Contoured  Distributions. 

Estimation theory for the normal distribution is highly developed. Let $X_1, \ldots, X_n$ be
a sample of independent observations from $N_p(\mu, \Sigma)$. The maximum likelihood estimators
of $\mu$ and $\Sigma$ are the sample mean and the sample covariance matrix

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})', \tag{3.1}$$

respectively. Estimation in elliptical populations can be established in parallel fashion.

Let the matrix of observations $X$ have an elliptical matrix distribution (2.10) with
$M = \mathbf{1}\mu'$ and $AA' = \Sigma$. We want to estimate the parameters $\mu$ and $\Sigma$. If $X$ has a
density and $X \in LE$, the density must have the form

$$|\Sigma|^{-n/2}\, g\big[(X - M)'\Sigma^{-1}(X - M)\big]. \tag{3.2}$$

When $X$ is from $ME$, $VE$, or $SE$, the density of $X$ has the same form (3.2) with $g[\mathrm{diag}(\cdot)]$,
$g[\mathrm{tr}(\cdot)]$, and $g[\mathrm{eig}(\cdot)]$, respectively. We shall write $X \sim LE(\mu, \Sigma, g)$, $X \sim ME(\mu, \Sigma, g)$,
and so on for these models. In this section some results on maximum likelihood estimates
(MLE), minimax estimates, shrinkage estimates, and inadmissibility of the sample mean
are mentioned.

For $X \sim VE(\mu, \Sigma, g)$ Anderson and Fang [4] developed a new approach to the MLE's
of $\mu$ and $\Sigma$. The MLE's are

$$\hat{\mu} = \bar{X}, \quad \text{and} \quad \hat{\Sigma} = y_g S = y_g (X - \mathbf{1}\bar{X}')'(X - \mathbf{1}\bar{X}'), \tag{3.3}$$

where the constant $y_g$ will be given in Lemma 3.1 below. Later Anderson, Fang, and Hsu
[5] established the relationship of the MLE's in normal and elliptical models and therefore
gave a unified approach to MLE for $VE$. Their main result is the following:

Theorem 3.1. Let $\Omega$ be a set in the space of $(\mu, V)$, $V > 0$, such that if $(\mu, V) \in \Omega$
then $(\mu, cV) \in \Omega$ for all $c > 0$. Suppose $g$ is such that $g(x'x)$ is a density in $R^N$ and
$y^{N/2} g(y)$ has a finite positive maximum at $y_g$. Suppose that on the basis of an observation
$x$ from $|V|^{-1/2} g[(x - \mu)'V^{-1}(x - \mu)]$ the MLE's under normality $(\tilde{\mu}, \tilde{V}) \in \Omega$ exist and
are unique and that $\tilde{V} > 0$ with probability 1. Then the MLE's for $g$ are

$$\hat{\mu} = \tilde{\mu}, \qquad \hat{V} = (N/y_g)\,\tilde{V},$$

and the maximum of the likelihood is $|\hat{V}|^{-1/2} g(y_g)$.

The existence of $y_g$ mentioned in the theorem may be based on the following lemma.

Lemma 3.1. Suppose that $g(x'x)$ is a density in $x \in R^N$ such that $g(y)$ is continuous
and decreasing for $y$ sufficiently large. Then the function

$$h(y) = y^{N/2} g(y), \qquad y \ge 0,$$

has a maximum at some finite $y_g > 0$. An alternative condition is that $g$ is continuous and
$E(X'X) < \infty$.
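To make the lemma concrete, the maximizer $y_g$ can be located numerically for a specific generator. The sketch below assumes the normal generator $g(y) = (2\pi)^{-N/2} e^{-y/2}$, for which setting the derivative of $y^{N/2} g(y)$ to zero gives $y_g = N$; the golden-section routine and the search interval are illustrative choices, not part of the cited results.

```python
import math


def normal_generator(n_dim):
    """Density generator of N(0, I_N): g(y) = (2*pi)^(-N/2) * exp(-y/2)."""
    const = (2.0 * math.pi) ** (-n_dim / 2.0)
    return lambda y: const * math.exp(-y / 2.0)


def h(y, n_dim, g):
    """The function h(y) = y^(N/2) * g(y) from the lemma."""
    return y ** (n_dim / 2.0) * g(y)


def argmax_unimodal(func, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) > func(d):
            hi = d   # the maximum lies in [lo, d]
        else:
            lo = c   # the maximum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0


N = 4                      # hypothetical dimension
g = normal_generator(N)
y_g = argmax_unimodal(lambda y: h(y, N, g), 0.0, 50.0)
```

The numerical maximizer agrees with the closed-form value $y_g = N$ to within the accuracy of the search. Other generators can be substituted for `normal_generator` as long as $h$ is unimodal on the chosen interval.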

Fang and Xu [51] and Fang, Xu, and Teng [54] extended the above results to the cases
of $SE$, $ME$, and $LE$. Since the MLE of a function of $\mu$ and $\Sigma$ is that same
function of the MLE's $\hat{\mu}$ and $\hat{\Sigma}$, we thus obtain the MLE's for the most useful statistics
in multivariate analysis.

The usual estimator of $\mu$ in the normal population, namely the sample mean, is
inadmissible under a quadratic loss if the dimension of the observations is greater than 2;
this result is due to Stein [76]. After improvement of the original proof, several concise
proofs have been proposed; see Anderson [1], for example. Among the many papers on
this topic, Brandwein and Strawderman [10] established the inadmissibility of the sample
mean for spherical distributions when the dimension is greater than 3. Their proof is
very long in comparison to the concise proof for the normal case given in [1]. Fan and Fang
[31] have given an improved proof which is much shorter than the original one and whose
conditions are weaker. Let $X$ have an elliptical matrix distribution $LE(\mu, \Sigma, g)$. It is easy
to see that $(\bar{X}, S)$ is a sufficient statistic for $(\mu, \Sigma)$ by the Fisher-Neyman factorization
theorem. Therefore, the inadmissibility of the mean can be expressed in the following
simple statement.

Theorem 3.2. Suppose that $X \sim EC_p(\mu, I_p, \phi)$; that is, $X - \mu$ is spherical. Then
the estimate

$$\delta_a(X) = (1 - a/\|X\|^2)\,X \tag{3.4}$$

is better than the usual estimate $X$ under quadratic loss, provided that $p > 3$ and

$$0 < a \le \frac{2(p - 3)}{(p - 1)\,E_0\|X\|^{-2}}, \tag{3.5}$$

where $E_0\|X\|^{-2}$ is the expected value of $\|X\|^{-2}$ when $\mu = 0$.

This result can be extended to the case where the loss has the form $W[(\delta - \mu)'(\delta - \mu)]$
and $W(\cdot)$ is a nonnegative convex function. Furthermore, the estimator

$$\delta_{a,f}(X) = \big(1 - a f(\|X\|^2)/\|X\|^2\big)\,X, \tag{3.6}$$

where $0 < f(x) < 1$, $f(x)$ is nondecreasing, $f(x)/x$ is nonincreasing for $x > 0$, and
$f''(x) < 0$ for $x > 0$ [31] and $a$ satisfies (3.5), also dominates $X$. Note that for the
inadmissibility of the sample mean under quadratic loss the condition $p > 2$ in the
normal case becomes $p > 3$ in the spherical case.
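The shrinkage rule (3.4) is a one-line computation. The sketch below is a minimal illustration; the observation and the constant `a` are hypothetical values chosen for easy arithmetic, and the code does not check `a` against the bound (3.5), which depends on the unknown quantity $E_0\|X\|^{-2}$.

```python
def shrinkage_estimate(x, a):
    """Return delta_a(x) = (1 - a / ||x||^2) * x for a nonzero vector x."""
    norm_sq = sum(v * v for v in x)
    factor = 1.0 - a / norm_sq
    return [factor * v for v in x]


x = [3.0, 4.0, 0.0, 0.0, 0.0]   # hypothetical observation with p = 5 and ||x||^2 = 25
a = 5.0                          # hypothetical shrinkage constant
delta = shrinkage_estimate(x, a)
shrink_factor = 1.0 - a / 25.0
```

Every coordinate is multiplied by the same factor $1 - a/\|x\|^2$, here 0.8, so the estimate is pulled toward the origin along the ray through the observation. Observations far from the origin are shrunk proportionally less.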

We now consider minimax estimates of $\mu$. Let $X$ have a distribution $VE(\mu, \Sigma, g)$,
where $g(\cdot)$ is a nonnegative decreasing function. Let $W(\cdot)$ be a nonnegative increasing
function. Fan and Fang [29] pointed out that under the loss $W[(d - \mu)'\Sigma^{-1}(d - \mu)]$, the
sample mean $\bar{X}$ is a minimax estimate for $\mu$. Furthermore, they found that if $X_1, \ldots, X_n$
are independently drawn from $EC_p(\mu, I, g)$, then under the loss function $W(\|d - \mu\|)$ the mean
$\bar{X}$ is a minimax estimate in the class $\{h(\bar{X}) : h(\cdot) \text{ a real function}\}$. Some sequential
minimax properties for the sample mean and Stein's two-stage estimate are also discussed
in [30]. In fact, we can find a wider class of minimax estimates of $\mu$, such as $\delta_a(X)$ in
(3.4) and $\delta_{a,f}(X)$ in (3.6), in a certain sense. (See [30].) The estimates (3.4) and (3.6) are
shrinkage estimates.

The  reader  is  referred  to  Section  4.4.2  of  [58]  for  further  discussion. 

4.  Testing  Hypotheses  about  Elliptically  Contoured  Distributions. 

Let the matrix of observations $X$ have an elliptical matrix distribution $LE(\mu, \Sigma, g)$,
where $(\mu, \Sigma) \in \Omega$, the parameter space. We want to test

$$H_0 : (\mu, \Sigma) \in \omega \quad \text{vs.} \quad H_1 : (\mu, \Sigma) \in \Omega \setminus \omega. \tag{4.1}$$

Statistics for testing (4.1) can be derived by different principles, among which is the
likelihood ratio criterion (LRC). From (3.2) the likelihood function is

$$L(\mu, \Sigma) = |\Sigma|^{-n/2}\, g\big[(X - \mathbf{1}\mu')'\Sigma^{-1}(X - \mathbf{1}\mu')\big]. \tag{4.2}$$

Hence, the LRC for testing (4.1) is

$$T(X) = \max_{\omega} L(\mu, \Sigma) \Big/ \max_{\Omega} L(\mu, \Sigma). \tag{4.3}$$
When $X \in VE$, Anderson and Fang [4] obtained many statistics used in multivariate
analysis, such as the criteria for testing lack of correlation between sets of variates, testing
the hypothesis that a mean vector is equal to a given vector, testing equality of several
covariance matrices, testing equality of several means, etc., and found that these statistics
have the same form and the same null distribution in these distributions as in the normal
distribution. With Theorem 3.1 Anderson, Fang, and Hsu [5] gave a unified approach
to LRC's and established the relationship of distributions of the LRC between normal
and other elliptical populations. Fang and Xu [51] and Fang, Xu, and Teng [54] extended
the results systematically to the wider classes $ME$, $SE$, and $LE$. They found that there
are some statistics (but not all of those in $VE$) that have the same form and the same
distribution within the entire class. Chmielewski [18] studied invariant statistics for testing
equality of $k$ covariance matrices. Chen [16] pointed out that the invariants obtained in
[18] are correct only for $k = 2$ and gave the correct invariant statistic for arbitrary $k$.

A  necessary  and  sufficient  condition  for  a  statistic  to  be  invariant  in  classes  of  elliptical 
matrix  distributions  can  be  obtained  as  in  Theorem  2.3.  Kariya  [64]  gave  an  alternative 
necessary and sufficient condition, but Bian, Wang, and Zhang [8] found that there is a
gap in Kariya's proof. They gave a counterexample to his result, but proved that if
the matrix of observations has a density, then Kariya's theorem is true.

Although the null distribution of an invariant statistic is the same for all elements of
the class, the nonnull distribution depends on the specific element of the class. That
consideration leads to derivations of noncentral distributions. (See [25], [36], [77].) Let
$X \sim EC_n(\mu, I_n, \phi)$. Define the sample mean $\bar{x}$ and standard deviation $s$ by

$$\bar{x} = \frac{1}{n}\mathbf{1}'X, \qquad s^2 = \frac{1}{n} X'\Big(I_n - \frac{1}{n}\mathbf{1}\mathbf{1}'\Big)X.$$

Fang and Yuan [56] studied the power of the $t$-test in the class of $EC_n(\mu, I, \phi)$ and found
that the power can be very different for the different elements of the class. They furthermore
pointed out the following.


Theorem 4.1. Let

$$X_1 \overset{d}{=} \mu + R_1 U^{(n)}, \qquad X_2 \overset{d}{=} \mu + R_2 U^{(n)}. \tag{4.4}$$

Let $t_i = \sqrt{n}\,\bar{X}_i / s_i$, where $\bar{X}_i$ and $s_i$ are the sample mean and standard deviation of $X_i$,
and let $d(x, y) = |x - y|$ be the $L_1$-norm distance. Then

$$E\,d(t_1, t_2) = c\,E\,d(1/R_1, 1/R_2), \tag{4.5}$$

where $c$ is a known constant.

Then $E\,d(t_1, t_2)$ can be very large if $E\,d(1/R_1, 1/R_2)$ is very large. With this theorem
Fang and Yuan [56] obtained the limiting distribution of the $t$-statistic in some subclasses
of $EC_n(\mu, I, g)$. The convergence of the statistic is not only in distribution, but also in
density. The same approach can be applied to the $F$-statistic, $T^2$-statistic, and so on.

The invariance of a statistic in the class of elliptical distributions can be employed for
enlarging the class. For example, let

$$F_t = \{X : X \text{ is exchangeable and } t_X \sim t_{n-1}\} \tag{4.6}$$

be a set of $n$-dimensional random vectors such that the corresponding $t$-statistic has the
same distribution as in the normal case. ($X$ is exchangeable if $X \overset{d}{=} PX$ for every
permutation matrix $P$.) Obviously, the set SD belongs to $F_t$. In fact, the class $F_t$ is
much larger than SD. More precisely, let $VT$ denote the class of $X$ that have the stochastic
decomposition (1.2) without necessarily independence of $R$ and $U^{(n)}$. Then $SD \subset VT \subset F_t$.
This class can serve for deriving Bayesian statistics, but its structure has not yet been
sufficiently investigated.

Many LRC's in normal populations yield uniformly most powerful (UMP) and
unbiased tests. Do those tests retain their optimal properties in elliptical populations? Quan
[72] and Quan and Fang [73] investigated this subject for $VE$ and found that many LRC's
keep these properties as follows: Let $X \sim VE(\mu, \Sigma, g)$.

1) Partition $\mu$ into two subvectors $\mu_1$ and $\mu_2$. Consider testing

$$H_0 : \mu_1 = \mu_2 = 0 \quad \text{vs.} \quad H_1 : \mu_1 \ne 0,\ \mu_2 = 0. \tag{4.7}$$

If the density generating function $g(\cdot)$ is monotonically decreasing and differentiable and
$g'(\cdot)$ is increasing, then the LRC test for (4.7) is UMP in the class of tests based on the
likelihood ratio statistic.

2) Let $\bar{R}$ be the population multiple correlation in $VE(\mu, \Sigma, g)$ and let $R$ be the sample
multiple correlation. If $g$ satisfies the conditions in 1), then the LRC for $H_0 : \bar{R} = 0$ is
UMP invariant.

3)  The  Wilks  statistic  and  the  statistics  for  testing  lack  of  correlation  between  sets  of 
variates,  testing  equality  of  several  covariance  matrices,  testing  equality  of  several  mean 
vectors  and  covariance  matrices  simultaneously,  and  the  sphericity  test  are  unbiased  if  g 
is  decreasing. 
The goodness of fit test for elliptical symmetry is a difficult problem. Deng [23]
proposed a significance test for elliptical symmetry by use of moment sequences. It is
evident that his method requires all moments to be finite.

5.  Applications. 

Application of the established theory of ECD and EMD shows that many well-known
techniques of multivariate analysis, such as regression analysis, multivariate analysis of
variance, principal component analysis, canonical correlation analysis, discriminant
analysis, and econometric methods, are valid in these wider classes.

Consider  the  general  regression  model 

Y  =  f(X,B)  +  E,  (5.1) 

where Y : n × p, X : n × q, E : n × p, B : p × q, E has a spherical matrix distribution, and
B is the matrix of undetermined regression coefficients. When

f(X,B)  =  XB,  (5.2) 

(5.1) is the linear model. In general, the least squares estimate (LSE) of B has the same
form

$\hat{B} = (X'X)^{-} X'Y$  (5.3)

as in the case of E having a normal distribution. Here $(X'X)^{-}$ denotes a generalized
inverse of X'X. When $E \sim VS(0, \Sigma, g)$ and g is a decreasing function, Anderson and
Fang [3] obtained the maximum likelihood estimates of B and $\Sigma$, and their distributions.
These results extended some pioneering work of Box, Thomas, and Zellner mentioned in [19].
Bian and Zhang [9] and Fang, Xu and Teng [54] obtained similar results in the classes MS,
SS, and LS and gave invariant statistics for testing some hypotheses about B. Combining
the  above  results  with  distributions  of  quadratic  forms  and  Cochran’s  theorem  for  ECD 
and  EMS,  we  have  systematically  established  the  theory  and  methods  of  linear  models  for 
ECD  and  EMS. 
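
The estimate (5.3) can be illustrated numerically. The sketch below is not taken from the cited papers: it assumes the Moore-Penrose inverse as one particular choice of generalized inverse, takes B as q × p so that XB conforms with Y, and uses a scale mixture of normals as a hypothetical spherical error matrix. It checks that the resulting estimate satisfies the normal equations even when X'X is singular.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, p = 50, 3, 2

# Design matrix with a duplicated column, so that X'X is singular.
X = rng.normal(size=(n, q))
X[:, 2] = X[:, 0]

# Assumption: B is q x p here so that X @ B is n x p.
B_true = rng.normal(size=(q, p))

# Hypothetical spherical error: a normal matrix divided by one
# independent positive scalar (a scale mixture of normals).
E = rng.normal(size=(n, p)) / np.sqrt(rng.chisquare(df=5) / 5)
Y = X @ B_true + E

XtX = X.T @ X
# Moore-Penrose inverse used as one generalized inverse of X'X.
B_hat = np.linalg.pinv(XtX) @ X.T @ Y

# Any solution of (5.3) must satisfy the normal equations X'X B = X'Y.
normal_eq_gap = np.abs(XtX @ B_hat - X.T @ Y).max()
```

The computation never uses the error distribution, which is the sense in which the LSE has the same form as in the normal case.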

Fan  [27]  and  Fan  and  Fang  [29]  discussed  shrinkage  estimates,  ridge  regression,  and 
inadmissibility  of  estimators  of  regression  coefficients  for  ECD  and  EMS.  Lin  and  Gong  [68] 
considered  two  seemingly  unrelated  regression  models  under  some  regularity  conditions. 
They  gave  the  small  sample  properties  of  Zellner’s  estimator  when  the  disturbances  have 
ECD’s.  Pan  [71]  obtained  the  LSE  for  the  growth  curve  model  and  some  related  invariant 
statistics for ECD.

When f(X, B) is a nonlinear function of B, the model (5.1) is a nonlinear regression
model. Wei [81] and Cao, Wei and Qian [15] discussed the nonlinear regression model with
ECD errors and gave an asymptotic expansion and the bias, variance, and skewness of
the LSE by a differential geometry approach.

Principal component analysis and canonical correlation analysis are important
techniques of multivariate analysis. When the matrix of observations X is from LS,
the algebraic derivations of these two analyses are the same as before. However, the
corresponding distribution theory and tests of hypotheses may be different. Here we need
to find the distributions of the eigenvalues and eigenvectors of $X'PX$ or of $X'P_1X$ with
respect to $X'P_2X$, where $P$, $P_1$, and $P_2$ are positive definite matrices. The distributions
of the eigenvalues and eigenvectors of $X'PX$ for $X \in SS$ were derived by Fang and Zhang
in Section 3.5.6 of [58] and [17], which extended the results for $X \in VS$ of Anderson
and Fang [2]. From the point of view of spectral decomposition, Fang and Chen [43]
studied the spherical matrix distribution and obtained some new subclasses of LS. Their
results can be applied to principal component analysis in LS. As the distributions of the
eigenvalues and the eigenvectors of $X'P_1X$ with respect to $X'P_2X$ are invariant in the
class of VS, canonical correlation analysis can be used in the class.
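
One elementary reason for this invariance can be checked numerically. The sketch below assumes that a member of VS can be written as X = RU, with R a positive scalar independent of U; the helper names and the matrices P1 and P2 are hypothetical. Under that assumption the eigenvalues of X'P_1X with respect to X'P_2X do not depend on R at all.

```python
import numpy as np

def relative_eigenvalues(A, B):
    """Roots of det(A - lam * B) = 0 for symmetric A and positive definite B."""
    L = np.linalg.cholesky(B)            # B = L L'
    L_inv = np.linalg.inv(L)
    M = L_inv @ A @ L_inv.T              # symmetric, same eigenvalues as B^{-1} A
    return np.linalg.eigvalsh(M)

def random_positive_definite(n, rng):
    G = rng.normal(size=(n, n))
    return G @ G.T + n * np.eye(n)

rng = np.random.default_rng(1)
n, p = 8, 3
P1 = random_positive_definite(n, rng)
P2 = random_positive_definite(n, rng)

U = rng.normal(size=(n, p))
U /= np.linalg.norm(U)                   # Frobenius norm 1

R = rng.gamma(shape=2.0)                 # arbitrary positive radial variable
X = R * U

lam_unit = relative_eigenvalues(U.T @ P1 @ U, U.T @ P2 @ U)
lam_scaled = relative_eigenvalues(X.T @ P1 @ X, X.T @ P2 @ X)
```

Both matrices are multiplied by the same factor R², which cancels in the relative eigenvalue problem.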

Since the distribution of the discriminant function is not invariant in the class of SMD,
it is more difficult to establish the theory of discriminant analysis for SMD. Cacoullos and
Koutras [11] considered the minimum-distance discrimination for SD. Quan, Fang and
Teng [74] applied the information function I(f, g) of f and g, defined by

$I(f, g) = \int f(x) \log[f(x)/g(x)]\, dx$,

to discriminant analysis. They proved that under some conditions the information function
is a monotonic function of the Mahalanobis distance.
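
The normal case gives a simple check of this monotonicity. For two normal densities with a common variance, I(f, g) equals half the squared Mahalanobis distance. The sketch below verifies this in one dimension by a Riemann sum; the grid and the helper names are illustrative choices, not part of [74].

```python
import numpy as np

def normal_pdf(x, mean, sd=1.0):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def information(f_vals, g_vals, dx):
    """Riemann-sum approximation of I(f, g) = integral of f log(f / g)."""
    return float(np.sum(f_vals * np.log(f_vals / g_vals)) * dx)

x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]

# With unit variance, the Mahalanobis distance between N(0, 1) and N(d, 1) is d.
distances = [0.5, 1.0, 2.0]
values = [information(normal_pdf(x, 0.0), normal_pdf(x, d), dx) for d in distances]
closed_form = [d ** 2 / 2.0 for d in distances]
```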

There  are  some  studies  of  the  application  of  the  theory  of  ECD  to  econometrics.  Teng 
and  Chen  [78]  and  Teng,  Fang,  and  Deng  [77]  derived  the  distribution  of  the  instrumental 
variable  (IV)  estimator  of  the  coefficients  of  the  endogenous  variables  in  the  simultaneous 
equations  with  spherical  disturbance  and  some  related  distributions.  The  reader  is  referred 
to  Kunitomo  [66]  for  further  results  in  econometrics. 

6.  Symmetric  Multivariate  Distributions. 

Why does the class of spherical distributions have so many nice properties? One of
the reasons is its special structure (1.2), where $U^{(n)}$ is common to all members of the
class. Therefore, a spherical distribution is uniquely determined by the distribution of a
scalar variable R, but many properties of SD are independent of R, as mentioned before.
That fact suggests looking for other classes of symmetric multivariate distributions that
have a structure similar to (1.2) and comparably attractive properties.

Given an n-dimensional random vector Y, we may define a corresponding family of
distributions by

$\mathcal{F}(Y) = \{X : X = RY,\ R > 0 \text{ is independent of } Y\}$,  (6.1)

and call Y the generating vector of the family $\mathcal{F}(Y)$. For simplicity, in this section we
always assume P(Y = 0) = 0 for each generating vector. By choosing different Y we
obtain different classes of distributions. The following approach seems to yield useful
generating vectors:

1) Take a sample $Z_1, \ldots, Z_n$ from a population with cdf F(z).

2) Let $Z = (Z_1, \ldots, Z_n)'$ and set $Y = (Y_1, \ldots, Y_n)'$ with

$Y_i = Z_i / \|Z\|, \quad i = 1, \ldots, n.$
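
The two steps above can be sketched as follows. The function name, the sample size, and the radial variable are hypothetical; the normal sample with the Euclidean norm and the exponential sample with the L_1-norm are the two cases treated in the text.

```python
import numpy as np

def generating_vector(z, norm_order):
    """Step 2: normalize a sample z by its L_p norm, Y_i = Z_i / ||Z||."""
    return z / np.linalg.norm(z, ord=norm_order)

rng = np.random.default_rng(2)
n = 5

# Step 1 with a normal sample, Euclidean norm: a point on the unit sphere.
y_sphere = generating_vector(rng.normal(size=n), norm_order=2)

# Step 1 with an exponential sample, L_1-norm: a point on the simplex.
y_simplex = generating_vector(rng.exponential(size=n), norm_order=1)

# A member of the family (6.1): X = R * Y with R > 0 independent of Y.
r = np.sqrt(rng.chisquare(df=3))         # hypothetical radial variable
x = r * y_sphere
```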

For example, if $Z_1, \ldots, Z_n$ is from $N(0, \sigma^2)$ and the norm is defined as the Euclidean
norm, then Y is simply $U^{(n)}$ and $\mathcal{F}(Y)$ is the family of SD’s. If $Z_1, \ldots, Z_n$ are sampled
from an exponential distribution and the norm is defined as the $L_1$-norm, then Y is uniformly
distributed on the simplex $B_n = \{z : z_i > 0,\ i = 1, \ldots, n,\ \sum_i z_i = 1\}$ and $\mathcal{F}(Y)$
is the so-called class of multivariate $L_1$-norm symmetric distributions that was defined and
studied by Fang and Fang [46], [48], [33], [34], [35]. The family $\mathcal{F}(Y)$ retains most of the
important properties of Z. Hence, the family of SD’s can be regarded as a multivariate
extension of $N(0, \sigma^2)$ and the family of multivariate $L_1$-norm symmetric distributions as
a multivariate extension of the exponential distribution. We can use the same technique
in studying these families. The following table gives a brief introduction to multivariate
extensions of this kind.

Table 1.

Univariate distribution        Its multivariate extension
-----------------------------  ----------------------------------
normal                         spherical
lognormal                      logspherical
additive logistic normal       additive logistic spherical
exponential                    multivariate L_1-norm symmetric
gamma & beta                   multivariate Liouville
Cauchy & stable law            α-symmetric multivariate
Cauchy & stable law            spherical stable law
symmetric gamma                generalized symmetric Dirichlet
Scheidegger-Watson             rotationally invariant

Let $W = (W_1, \ldots, W_n)'$ be a positive random vector. If $\log W = (\log W_1, \ldots, \log W_n)'$
has an ECD we say that W has a logelliptical distribution. Let $X = (X_1, \ldots, X_n)'$ be a random
vector on the (n − 1)-dimensional simplex, and let $Y = [\log(X_1/X_n), \ldots, \log(X_{n-1}/X_n)]'$.
If Y has an ECD we say X has an additive logistic elliptical distribution. These two families
were defined and studied by Bentler, Fang, and Wu [7] and Fang, Bentler, and Chou [41]. The
reader can refer to Section 2.8 of [49].
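
The log-ratio map used in this definition, and its inverse, can be sketched as below. The function names are hypothetical; the inverse is the usual additive logistic transformation, stated here as a standard fact rather than as something taken from [7] or [41].

```python
import numpy as np

def additive_log_ratio(x):
    """Map a simplex point (positive, summing to 1) to Y_i = log(X_i / X_n), i < n."""
    x = np.asarray(x, dtype=float)
    return np.log(x[:-1] / x[-1])

def inverse_additive_log_ratio(y):
    """Recover the simplex point from its log-ratios."""
    expanded = np.append(np.exp(y), 1.0)
    return expanded / expanded.sum()

x = np.array([0.2, 0.5, 0.3])
y = additive_log_ratio(x)                # unconstrained vector of length n - 1
x_back = inverse_additive_log_ratio(y)   # back on the simplex
```

Because the map is one-to-one, any ECD placed on Y induces a distribution for X on the simplex.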

Taking $Y \sim D(\alpha_1, \ldots, \alpha_n)$, a Dirichlet distribution, with the norm defined as the
$L_1$-norm, the corresponding family $\mathcal{F}(Y)$ is called the family of multivariate Liouville
distributions, which can be regarded as an extension of both the gamma and the beta
distributions as well as of the multivariate $L_1$-norm symmetric distributions. The
multivariate Liouville distributions have been discussed by many authors. Gupta and
Richards [60] gave a comprehensive study of this family under the assumption of a density.
Without this assumption Anderson and Fang [2,3], and Fang, Kotz and Ng [49] gave a
parallel discussion with more results. It is worth noting that the structure (1.2) can be
applied to a nonsymmetric generating vector Y by the same approach as for symmetric
generating vectors.
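
A Dirichlet generating vector can be produced by the same normalization recipe, using the standard fact that independent gamma variables divided by their sum are Dirichlet distributed. In the sketch below the parameter values and the lognormal radial variable are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
alpha = np.array([1.0, 2.0, 3.0])
m = 20000

# Independent Gamma(alpha_i, 1) columns, normalized by the L_1-norm.
z = rng.gamma(shape=alpha, size=(m, alpha.size))
y = z / z.sum(axis=1, keepdims=True)     # rows follow D(alpha_1, ..., alpha_n)

# X = R * Y with a hypothetical positive R independent of Y.
r = rng.lognormal(size=(m, 1))
x = r * y

mean_y = y.mean(axis=0)                  # should be near alpha / alpha.sum()
```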

There is more than one natural way to generalize a univariate distribution to its
multivariate extension with structure (1.2). The c.f. of a stable law is

$\exp(-\lambda |t|^{\alpha}), \quad 0 < \alpha < 2.$  (6.2)

One may rewrite (6.2) as $\exp(-\lambda \|t\|^{\alpha})$, a function of the $L_2$-norm of t, yielding what is
called a spherically symmetric stable law. A detailed discussion of this family is given by
Zolotarev [91]. Alternatively, one may consider (6.2) as a function of an $L_\alpha$-norm of t
with dimension n = 1. This way leads to the α-symmetric multivariate distribution that
was defined and thoroughly studied by Cambanis, Keener, and Simons [14]. Zhang [86]
obtained the distribution of the sum of squares of independent Cauchy variables and its
asymptotic distribution. Zhang [87] generalized α-symmetric multivariate distributions to
the matrix case and found its stochastic decomposition for the case of the matrix having
infinitely many rows.
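
The two readings of (6.2) can be compared directly. The sketch below assumes the simplest instance of each extension, with the c.f. depending on t only through the stated norm raised to the power α; the function names and numerical values are hypothetical.

```python
import numpy as np

def spherical_stable_cf(t, lam, alpha):
    """(6.2) read as a function of the L_2-norm: exp(-lam * ||t||_2 ** alpha)."""
    return float(np.exp(-lam * np.linalg.norm(t, ord=2) ** alpha))

def alpha_symmetric_cf(t, lam, alpha):
    """(6.2) read as a function of the L_alpha-norm: exp(-lam * sum |t_i| ** alpha)."""
    return float(np.exp(-lam * np.sum(np.abs(t) ** alpha)))

lam, alpha = 0.5, 1.0                    # alpha = 1 is the Cauchy case

t_scalar = np.array([0.8])
t_vector = np.array([0.3, -1.2, 0.7])

same_in_one_dim = np.isclose(spherical_stable_cf(t_scalar, lam, alpha),
                             alpha_symmetric_cf(t_scalar, lam, alpha))
differ_in_three_dim = not np.isclose(spherical_stable_cf(t_vector, lam, alpha),
                                     alpha_symmetric_cf(t_vector, lam, alpha))
```

Both functions reduce to (6.2) when n = 1 but define different multivariate laws when n > 1.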

Symmetrizing the Dirichlet distribution about the origin leads to the symmetrized
Dirichlet distribution (SDD). Take Y having an SDD; then the corresponding family $\mathcal{F}(Y)$
is called the family of generalized symmetrized Dirichlet distributions, which contains the
family of SD as a special case and, when the parameters of Y are equal, retains many
properties of Z, where Z is sampled from a symmetrized gamma distribution with the degrees
of freedom being the same value as the parameter of Y. Fang and Fang [47] made a thorough
study of this family.

Let V be a linear subspace of $R^n$ and let $P_V$ and $P_{V^\perp}$ be projection matrices onto the
subspaces V and $V^\perp$, respectively. In the statistics of directional data the
Scheidegger-Watson distribution serves as the standard model and is defined by its density

$g(x'P_V x), \quad \|x\| = 1,$  (6.3)

where g is a scalar function. Fan [28] suggested a family of distributions whose densities
have the form $g(x'P_V x,\ x'P_{V^\perp} x)$; the family includes both the family of ECD and the
family of S-W distributions. Fan called it the family of rotationally symmetric
distributions. Later Fang and Fan [44] discussed asymptotic properties of estimation and
hypothesis testing for the class. Let $X_1, \ldots, X_n$ be i.i.d. from a rotationally symmetric
distribution. They [32] found the MLE of $P_V$ and its maximum likelihood characterization,
which is defined as follows. Given an intuitive estimator for some parameters, we may
be interested in finding the parent distributions such that the estimator is the MLE. This
kind of problem is usually called the maximum likelihood characterization of the
distribution.
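
The two arguments of the density can be computed as below. This is only an illustrative sketch with a randomly chosen, hypothetical subspace V; it shows that the two quadratic forms add up to ‖x‖², so that on the unit sphere the second argument is determined by the first, as in (6.3).

```python
import numpy as np

rng = np.random.default_rng(4)
n, dim_v = 4, 2

A = rng.normal(size=(n, dim_v))          # columns span a hypothetical subspace V
Q, _ = np.linalg.qr(A)                   # orthonormal basis of V
P_v = Q @ Q.T                            # projection onto V
P_perp = np.eye(n) - P_v                 # projection onto the orthogonal complement

x = rng.normal(size=n)
x /= np.linalg.norm(x)                   # a point on the unit sphere

u = x @ P_v @ x                          # first argument, x' P_V x
w = x @ P_perp @ x                       # second argument, x' P_{V-perp} x
```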

Take Y having i.i.d. components with cdf F(x); then the class $\mathcal{F}(Y)$ consists of
mixtures of F(x). In the early stage of the study of SD researchers found many properties
of mixtures of normal distributions. Later they found that most of those properties can be
extended to the class of SD. A natural question is whether those properties can be extended
to some class wider than that of SD.

A largest characterization of SD is a demonstration that there is no generating vector
Y such that the family of SD is a proper subfamily of $\mathcal{F}(Y)$. This was proved by Fang
and Bentler [40]. They pointed out that this largest characterization can be extended to
the family of multivariate Liouville distributions.

With the Dirichlet distribution Fang and Xu [55] defined a class of multivariate
distributions including the multivariate logistic and Gumbel Type I distributions.

7. Miscellaneous.

In  this  section  we  include  some  work  not  cited  above.  First  of  all,  we  shall  introduce 
some  characterizations  of  multivariate  symmetric  and  related  distributions. 

Let X be an n × p random matrix. In general, the (marginal) normality of elements,
rows, and/or columns does not imply the multinormality of X. Zhang and Fang [88]
pointed out that the normality of X can be determined by the normality of 1) any element
of X if $X \in VS$; 2) any row of X if $X \in MS$; 3) $X_{11}, \ldots, X_{pp}$ if $X \in SS$; and 4) the upper
triangular elements of X if $X \in LS$. They furthermore discussed relationships between
the normality of X and the normality of linear transformations of X.

Since the order statistics of the exponential distribution have many nice properties,
Fang and Fang [34] derived various distributions and moments of the order statistics of a
multivariate $L_1$-norm symmetric distribution. Let Z be an n-dimensional interchangeable
random vector; let $Z_{(1)} \le \cdots \le Z_{(n)}$ be its order statistics; and define the normalized
spacings of Z as $U_i = (n - i + 1)(Z_{(i)} - Z_{(i-1)})$, $i = 1, \ldots, n$, with $Z_{(0)} = 0$. It is known
that $Z \overset{d}{=} U$ if $Z_1, \ldots, Z_n$ are i.i.d. and $Z_1$ has an exponential distribution. Fang and Fang
[35] extended this property to the class of multivariate $L_1$-norm symmetric distributions
and gave the characterization that if Z is an interchangeable random vector, then $Z \overset{d}{=} U$
if and only if Z has a multivariate $L_1$-norm symmetric distribution.
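
The normalized spacings can be computed as in the sketch below, which uses i.i.d. exponential samples only as an illustration; the function name and sample sizes are hypothetical. The spacings always have the same sum as the original vector, and for exponential data each spacing has the same mean as the parent distribution.

```python
import numpy as np

def normalized_spacings(z):
    """U_i = (n - i + 1) * (Z_(i) - Z_(i-1)), i = 1..n, with Z_(0) = 0, row by row."""
    z_sorted = np.sort(z, axis=-1)
    n = z_sorted.shape[-1]
    gaps = np.diff(z_sorted, axis=-1, prepend=0.0)
    weights = np.arange(n, 0, -1)        # n, n - 1, ..., 1
    return weights * gaps

rng = np.random.default_rng(5)
z = rng.exponential(size=(20000, 4))     # rows are i.i.d. exponential samples
u = normalized_spacings(z)

spacing_means = u.mean(axis=0)           # each should be near 1 for unit exponentials
```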

Let S be a connected set on the unit sphere in $R^n$ and let C be a cone associated with
S defined by

$C = \{x : x \in R^n,\ x/\|x\| \in S\} \cup \{0\}.$

If $X \in SD$ and P(X = 0) = 0, then $P(X \in C)$ has the same value for all distributions
in SD. This fact can be used for a characterization of the uniform distribution on a
sphere and for spherical distributions if X and X/||X|| are independent. The result is
due to Wang [80] and referred to in [49], pp. 163-165.
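
A small simulation makes the invariance concrete. The sketch below takes the positive orthant as a hypothetical choice of cone and compares a normal radius with a heavy-tailed one. Because membership in C depends only on the direction X/||X||, the two empirical probabilities agree exactly when the same directions are reused.

```python
import numpy as np

def in_positive_orthant(x):
    """Indicator of the cone generated by the part of the unit sphere with all coordinates > 0."""
    return np.all(x > 0, axis=1)

rng = np.random.default_rng(6)
m, n = 40000, 3

g = rng.normal(size=(m, n))
u = g / np.linalg.norm(g, axis=1, keepdims=True)        # uniform directions on the sphere

r_normal = np.sqrt(rng.chisquare(df=n, size=(m, 1)))    # radius of N(0, I_n)
r_heavy = 1.0 + rng.pareto(1.0, size=(m, 1))            # hypothetical heavy-tailed radius

p_normal = in_positive_orthant(r_normal * u).mean()
p_heavy = in_positive_orthant(r_heavy * u).mean()
```

For this cone the common value is 2^{-n} by symmetry, here 1/8.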

Let $X_1, \ldots, X_n$ be exchangeable normal variables with a common correlation $\rho$, and
let $X_{(1)}, \ldots, X_{(n)}$ be their order statistics. The random variable $G = X_{(k)} + \cdots + X_{(n)}$
is called the selection differential by geneticists and is of particular interest in genetic
selection. Fang and Liang [50] gave results concerning a conjecture of Tong [79] on the
distribution of this random variable as a function of $\rho$. The same technique can be applied
to yield general results for linear combinations of order statistics of ECD.

Let $\phi(x)$ and $\Phi(x)$ be the p.d.f. and the c.d.f. of N(0, 1), respectively. Mills’ ratio,
defined by

$M(x) = [1 - \Phi(x)]/\phi(x)$,

has been studied thoroughly. One can define similarly the Mills’ ratio $M(x, \Sigma)$ for
$N_n(0, \Sigma)$ and $EC_n(0, \Sigma, g)$. Fang and Xu [53] gave a detailed discussion of these Mills’
ratios. They [84] also obtained results on the expected values of zonal polynomials of EMD.
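
For the univariate normal case the ratio is easy to evaluate. The sketch below uses the complementary error function from the standard library to compute the upper tail; the function name is hypothetical, and the multivariate versions discussed in [53] are not attempted here.

```python
import math

def mills_ratio(x):
    """Mills' ratio M(x) = [1 - Phi(x)] / phi(x) for the standard normal."""
    upper_tail = 0.5 * math.erfc(x / math.sqrt(2.0))              # 1 - Phi(x)
    density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)   # phi(x)
    return upper_tail / density

ratios = {x: mills_ratio(x) for x in (0.0, 1.0, 3.0)}
```

At x = 0 the ratio equals sqrt(π/2), and for x > 0 it lies between the classical bounds x/(1 + x²) and 1/x.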


The  inverted  Wishart  distribution  has  been  used  in  Bayesian  statistics.  Many  inverted 
matrix  distributions  related  to  SMD  can  be  similarly  defined.  Xu  [83]  studied  the  inverted 
beta/Dirichlet  distributions  and  gave  some  applications  to  Bayesian  statistics. 


There are several studies of the moments of a multivariate distribution. Li [67] proposed a
new approach to this subject. Let X be an n × 1 random vector. The k-th moment of X
is defined as

$\Gamma_k(X) = \begin{cases} E(X \otimes X' \otimes \cdots \otimes X \otimes X'), & \text{if } k \text{ is even}, \\ E(X \otimes X' \otimes \cdots \otimes X' \otimes X), & \text{if } k \text{ is odd}. \end{cases}$

Li gave the relationship between $\Gamma_k(X)$ and all the k-th mixed moments of X and a
simple formula for the moments of a quadratic form of X as a function of $\Gamma_k(X)$. As an
application, he gave moments of ECD and its quadratic forms.
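
A sample analogue of this Kronecker-product moment can be sketched as follows. It assumes the alternating pattern X ⊗ X' ⊗ X ⊗ ... with k factors; the function name and the normal test data are hypothetical, and the code is not Li's own procedure.

```python
import numpy as np

def kron_moment(samples, k):
    """Average of x (x) x' (x) x (x) ... with k alternating factors over the sample rows."""
    total = None
    for x in samples:
        col, row = x[:, None], x[None, :]
        term = col
        for j in range(1, k):
            # Odd positions after the first take x', even positions take x.
            term = np.kron(term, row if j % 2 == 1 else col)
        total = term if total is None else total + term
    return total / len(samples)

rng = np.random.default_rng(7)
samples = rng.normal(size=(2000, 3))

gamma2 = kron_moment(samples, 2)         # n x n, the sample version of E(XX')
gamma3 = kron_moment(samples, 3)         # n^2 x n
```

For k = 2 the result is the usual second-moment matrix, so all second-order mixed moments can be read off its entries.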